307 research outputs found

    The evaluation of protein folding rate constant is improved by predicting the folding kinetic order with a SVM-based method

    Protein folding is a problem of broad interest, since it concerns the mechanism by which genetic information is translated into proteins with well-defined three-dimensional (3D) structures and functions. Recently, theoretical models have been developed to predict the protein folding rate from topological parameters derived from the native (atomic-resolution) protein structures. Previous works classified proteins into two groups exhibiting either single-exponential or multi-exponential folding kinetics. It is well known that these two classes of proteins are related to different protein structural features. The increasing number of available experimental kinetic data allows a machine learning approach to be applied to the problem, in order to predict the kinetic order of the folding process from the experimental data collected so far. This information can be used to improve the prediction of the folding rate. In this work we first describe a support vector machine-based method (SVM-KO) that predicts the kinetic order of the folding process for a given protein. Using this method we correctly classify 78% of the folding mechanisms over a set of 63 experimental data. Secondly, we focus on the prediction of the logarithm of the folding rate. This value can be obtained as a linear regression task with an SVM-based method. In this paper we show that the linear correlation between predicted and experimental data can improve when the regression task is computed over two different sets, instead of one, each composed of the proteins with a correctly predicted two-state or multistate kinetic order. Comment: The paper will be published in WSEAS Transactions on Biology and Biomedicine
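
    The idea of fitting one regression per kinetic class, rather than a single pooled regression, can be illustrated with a small sketch. This is not the authors' method: it swaps the SVM-based regression for ordinary least squares, takes the kinetic-order labels as given instead of predicting them, and uses made-up toy numbers (the descriptor values, rates and the names `fit_line` and `pearson` are all assumptions for illustration).

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson(a, b):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

# Synthetic toy records, NOT experimental data:
# (hypothetical topological descriptor, ln(folding rate), kinetic order label)
data = [
    (8.0, 9.1, "two"), (10.0, 7.8, "two"), (12.0, 6.2, "two"),
    (14.0, 5.1, "two"), (16.0, 3.4, "two"),
    (9.0, 4.0, "multi"), (11.0, 3.1, "multi"), (13.0, 2.5, "multi"),
    (15.0, 1.2, "multi"), (17.0, 0.6, "multi"),
]
xs = [x for x, _, _ in data]
ys = [y for _, y, _ in data]

# One regression over the whole set.
slope, intercept = fit_line(xs, ys)
pooled_pred = [slope * x + intercept for x in xs]

# One regression per kinetic class, predictions kept in the original order.
models = {}
for label in ("two", "multi"):
    sub = [(x, y) for x, y, k in data if k == label]
    models[label] = fit_line([p[0] for p in sub], [p[1] for p in sub])
split_pred = [models[k][0] * x + models[k][1] for x, _, k in data]

r_pooled = pearson(pooled_pred, ys)
r_split = pearson(split_pred, ys)
print(f"pooled r = {r_pooled:.3f}, class-specific r = {r_split:.3f}")
```

    With least squares the class-specific fit can never correlate worse with the training data than the pooled fit, so the sketch only shows the mechanics; the abstract's claim concerns predicted labels and held-out evaluation, which this toy does not reproduce.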

    The 4th Bologna Winter School: Hot Topics in Structural Genomics

    The 4th Bologna Winter School on Biotechnologies was held on 9–15 February 2003 at the University of Bologna, Italy, with the specific aim of discussing recent developments in bioinformatics. The school provided an opportunity for students and scientists to debate current problems in computational biology and possible solutions. The course, co-supported (as last year) by the European Science Foundation program on Functional Genomics, focused mainly on hot topics in structural genomics, including recent CASP and CAPRI results, recent and promising genome-wide predictions, protein–protein and protein–DNA interaction predictions and genome functional annotation. The topics were organized into four main sections (http://www.biocomp.unibo.it).

    The posterior-Viterbi: a new decoding algorithm for hidden Markov models

    Background: Hidden Markov models (HMMs) are powerful machine learning tools successfully applied to problems of computational molecular biology. In a predictive task, the HMM is endowed with a decoding algorithm in order to assign the most probable state path, and in turn the class labeling, to an unknown sequence. The Viterbi and the posterior decoding algorithms are the most common. The former is very efficient when one path dominates, while the latter, even though it does not guarantee to preserve the automaton grammar, is more effective when several concurring paths have similar probabilities. A third good alternative is 1-best, which was shown to perform as well as or better than Viterbi. Results: In this paper we introduce posterior-Viterbi (PV), a new decoding algorithm that combines the posterior and Viterbi algorithms. PV is a two-step process: first the posterior probability of each state is computed, and then the best allowed posterior path through the model is evaluated by a Viterbi algorithm. Conclusions: We show that PV decoding performs better than other algorithms, first on toy models and then on the computational biological problem of predicting the topology of beta-barrel membrane proteins. Comment: 23 pages, 3 figures
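
    The two-step process described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: "allowed" is taken to mean transitions with non-zero probability, the three-state toy model and its parameters are invented for the example, and the function names `forward_backward` and `posterior_viterbi` are hypothetical.

```python
import math

def forward_backward(obs, states, start, trans, emit):
    """Step 1: posterior[t][s] = P(state at position t is s | obs).

    Uses per-position scaling to avoid numerical underflow."""
    n = len(obs)
    fwd, scale = [], []
    for t, x in enumerate(obs):
        row = {}
        for s in states:
            if t == 0:
                prior = start[s]
            else:
                prior = sum(fwd[t - 1][r] * trans[r][s] for r in states)
            row[s] = prior * emit[s][x]
        c = sum(row.values())  # assumed > 0, i.e. obs is possible under the model
        scale.append(c)
        fwd.append({s: v / c for s, v in row.items()})
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        bwd[t] = {
            s: sum(trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r]
                   for r in states) / scale[t + 1]
            for s in states
        }
    return [{s: fwd[t][s] * bwd[t][s] for s in states} for t in range(n)]

def posterior_viterbi(obs, states, start, trans, emit):
    """Step 2: Viterbi-style search for the allowed path that maximises
    the product of per-position posterior probabilities."""
    post = forward_backward(obs, states, start, trans, emit)
    neg_inf = float("-inf")
    log = lambda p: math.log(p) if p > 0 else neg_inf
    score = [{s: log(post[0][s]) for s in states}]
    back = []
    for t in range(1, len(obs)):
        row, ptr = {}, {}
        for s in states:
            best_prev, best = None, neg_inf
            for r in states:
                # Only transitions permitted by the model grammar are considered.
                if trans[r][s] > 0 and score[t - 1][r] > best:
                    best_prev, best = r, score[t - 1][r]
            row[s] = best + log(post[t][s]) if best_prev is not None else neg_inf
            ptr[s] = best_prev
        score.append(row)
        back.append(ptr)
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Invented toy model: "in" and "out" can only be connected through "mem".
states = ["in", "mem", "out"]
start = {"in": 0.5, "mem": 0.0, "out": 0.5}
trans = {
    "in":  {"in": 0.7,  "mem": 0.3, "out": 0.0},
    "mem": {"in": 0.15, "mem": 0.7, "out": 0.15},
    "out": {"in": 0.0,  "mem": 0.3, "out": 0.7},
}
emit = {  # h = hydrophobic, p = polar (toy alphabet)
    "in":  {"h": 0.3, "p": 0.7},
    "mem": {"h": 0.9, "p": 0.1},
    "out": {"h": 0.3, "p": 0.7},
}
obs = "pphhhhppphhhpp"
pv_path = posterior_viterbi(obs, states, start, trans, emit)
print(pv_path)
```

    Plain posterior decoding would pick the highest-posterior state at each position independently and could jump directly between "in" and "out"; restricting the search to non-zero transitions is what keeps the PV path consistent with the model grammar.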

    SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments

    Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Like other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, where different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localization is crucial for large-scale functional studies. Results: In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments: inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach, which is a good candidate for integration into more general large-scale annotation pipelines of protein subcellular localization.

    NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases

    Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions underlying phenotypes, for enlarging the dataset of possibly related genes/proteins and for helping the interpretation and prioritization of newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network-based ones also include, in different ways, physical and functional relationships among different genes or proteins that can be extracted from the available biological networks of interactions.

    In silico evidence of the relationship between miRNAs and siRNAs

    Both short interfering RNAs (siRNAs) and microRNAs (miRNAs) mediate the repression of specific mRNA sequences through the RNA interference pathway. In recent years several experiments have supported the hypothesis that siRNAs and miRNAs may be functionally interchangeable, at least in cultured cells. In this work we verify that this hypothesis is also supported by computational evidence. We show that a method specifically trained to predict the activity of exogenous siRNAs assigns a high silencing level to experimentally determined human miRNAs. This result not only supports the idea of siRNA and miRNA equivalence but also indicates that computational tools developed using synthetic small interfering RNAs can be used to investigate endogenous miRNAs. Comment: 8 pages, 2 figures

    BUSCA: An integrative web server to predict subcellular localization of proteins

    Here, we present BUSCA (http://busca.biocomp.unibo.it), a novel web server that integrates different computational tools for predicting protein subcellular localization. BUSCA combines methods for identifying signal and transit peptides (DeepSig and TPpred3), GPI-anchors (PredGPI) and transmembrane domains (ENSEMBLE3.0 and BetAware) with tools for discriminating subcellular localization of both globular and membrane proteins (BaCelLo, MemLoci and SChloro). Outcomes from the different tools are processed and integrated for annotating subcellular localization of both eukaryotic and bacterial protein sequences. We benchmark BUSCA against protein targets derived from recent CAFA experiments and other specific data sets, reporting state-of-the-art performance. BUSCA scores better than all other evaluated methods on 2732 targets from CAFA2, with an F1 value equal to 0.49, and is among the best methods when predicting targets from CAFA3. We propose BUSCA as an integrated and accurate resource for the annotation of protein subcellular localization.

    I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure

    I-Mutant2.0 is a support vector machine (SVM)-based tool for the automatic prediction of protein stability changes upon single point mutations. I-Mutant2.0 predictions are performed starting either from the protein structure or, more importantly, from the protein sequence. This latter task, to the best of our knowledge, is addressed here for the first time. The method was trained and tested on a data set derived from ProTherm, which is presently the most comprehensive available database of thermodynamic experimental data of free energy changes of protein stability upon mutation under different conditions. I-Mutant2.0 can be used both as a classifier for predicting the sign of the protein stability change upon mutation and as a regression estimator for predicting the related ΔΔG values. Acting as a classifier, I-Mutant2.0 correctly predicts (with a cross-validation procedure) 80% or 77% of the data set, depending on whether structural or sequence information is used, respectively. When predicting ΔΔG values associated with mutations, the correlation of predicted with expected/experimental values is 0.71 (with a standard error of 1.30 kcal/mol) when structural information is used and 0.62 (with a standard error of 1.45 kcal/mol) when sequence information is used. Our web interface allows the selection of a predictive mode that depends on the availability of the protein structure and/or sequence. In this latter case, the web server requires only pasting of a protein sequence in a raw format. We therefore introduce I-Mutant2.0 as a unique and valuable helper for protein design, even when the protein structure is not yet known with atomic resolution. Availability:

    Large scale analysis of protein stability in OMIM disease related human protein variants

    Modern genomic techniques make it possible to associate several Mendelian human diseases with single-residue variations in different proteins. The molecular mechanisms explaining the relationship between genotype and phenotype are still under debate. The change of protein stability upon variation appears to be particularly relevant in annotating whether a single-residue substitution can or cannot be associated with a given disease. Thermodynamic properties of human proteins and of their disease-related variants are lacking. In the present work, we take advantage of the available three-dimensional structures of human proteins to predict the role of disease-related variations in the perturbation of protein stability.

    PredGPI: a GPI-anchor predictor

    Background: Several eukaryotic proteins associated with the extracellular leaflet of the plasma membrane carry a glycosylphosphatidylinositol (GPI) anchor, which is linked to the C-terminal residue after a proteolytic cleavage occurring at the so-called ω-site. Computational methods were developed to discriminate proteins that undergo this post-translational modification starting from their amino acid sequences. However, more accurate methods are needed for a reliable annotation of whole proteomes. Results: Here we present PredGPI, a prediction method that, by coupling a Hidden Markov Model (HMM) and a Support Vector Machine (SVM), is able to efficiently predict both the presence of the GPI-anchor and the position of the ω-site. PredGPI is trained on a non-redundant dataset of experimentally characterized GPI-anchored proteins whose annotation was carefully checked in the literature. Conclusion: PredGPI outperforms all the other previously described methods and is able to correctly replicate the results of previously published high-throughput experiments. PredGPI reaches a lower rate of false positive predictions than other available methods and is therefore a costless, rapid and accurate method for screening whole proteomes.